Search results for "document clustering"

showing 7 items of 7 documents

Benchmarking the sustainable manufacturing paradigm via automatic analysis and clustering of scientific literature: A perspective from Italian techno…

2019

The number of scientific papers in the field of Sustainable Manufacturing (SM) shows strong growth of interest in this topic over the last 20 years. Despite this large number of publications, a clear statement of what Sustainable Manufacturing means, or at least strong theoretical support for it, is still missing. The 6R framework seems to be a first attempt to rationalize this issue, as it is an axiomatic identification of its true nature. Recognizing whether an action pursues one or more of the Reduce-Recycle-Reuse-Recover-Redesign-Remanufacture principles allows users to identify whether a manufacturing action moves in the direction of sustainability. In the paper, the authors spec…

6R | 0209 industrial biotechnology | Computer science | Sustainable manufacturing | 02 engineering and technology | Benchmarking | Scientific literature | Data science | Industrial and Manufacturing Engineering | Field (computer science) | 6R; Document clustering; Sustainable manufacturing | Identification (information) | 020303 mechanical engineering & transports | 020901 industrial engineering & automation | 0203 mechanical engineering | Artificial Intelligence | Sustainability | Applied research | Document clustering | Settore ING-IND/16 - Tecnologie E Sistemi Di Lavorazione | Axiom | Sustainable manufacturing 6R Document clustering | Meaning (linguistics)
researchProduct

DBSCAN Algorithm for Document Clustering

2019

Document clustering is the problem of automatically grouping similar documents into categories based on some similarity metric. Almost all available data, usually on the web, are unclassified, so we need powerful clustering algorithms that work with these types of data. All common search engines return a list of pages relevant to the user query. This list needs to be generated as quickly and as correctly as possible. For this type of problem, because the web pages are unclassified, we need powerful clustering algorithms. In this paper we present a clustering algorithm called DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and its limitations on documents (or web pages)…

DBSCAN | Information retrieval | Similarity (network science) | Computer science | Web page | Feature selection | Document clustering | Cluster analysis | Data type | Word (computer architecture) | International Journal of Advanced Statistics and IT&C for Economics and Life Sciences
researchProduct

A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics

2012

XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficient…

Document Structure Description | Computer Networks and Communications | computer.internet_protocol | Computer science | Efficient XML Interchange | [SCCO.COMP]Cognitive science/Computer science | 0102 computer and information sciences | 02 engineering and technology | computer.software_genre | 01 natural sciences | Semantic similarity | XML Schema Editor | 020204 information systems | 0202 electrical engineering electronic engineering information engineering | XML schema | computer.programming_language | Information retrieval | [INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB] | [INFO.INFO-WB]Computer Science [cs]/Web | [INFO.INFO-MM]Computer Science [cs]/Multimedia [cs.MM] | XML validation | computer.file_format | Document clustering | Human-Computer Interaction | XML framework | Tree (data structure) | XML database | Tree structure | 010201 computation theory & mathematics | [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR] | 020201 artificial intelligence & image processing | Semi-structured data | Edit distance | computer | Software | XML | XML Catalog | Data integration
researchProduct

Towards Responsible AI for Financial Transactions

2020

Author's accepted manuscript. © 2020 IEEE. The application of AI in finance is increasingly dependent on the principles of responsible AI. These principles (explainability, fairness, privacy, accountability, transparency and soundness) form the basis for trust in future AI systems. In this empirical study, we address the first p…

FOS: Computer and information sciences | Computer Science - Machine Learning | Computer science | Computer Science - Artificial Intelligence | Decision tree | 02 engineering and technology | Machine learning | computer.software_genre | Machine Learning (cs.LG) | Empirical research | 020204 information systems | 0202 electrical engineering electronic engineering information engineering | Robustness (economics) | Categorical variable | VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550 | Soundness | business.industry | Document clustering | Transparency (behavior) | ComputingMethodologies_PATTERNRECOGNITION | Artificial Intelligence (cs.AI) | Financial transaction | 020201 artificial intelligence & image processing | Artificial intelligence | business | computer
researchProduct

ExtMiner : Combining multiple ranking and clustering algorithms for structured document retrieval

2006

This paper introduces ExtMiner, a platform and potential tool for information management in SMEs (small and medium-sized enterprises) or in organizational workgroups. ExtMiner supports interactive and iterative clustering of documents. It presents users with a visual cluster view and a list view at the same time, supporting an iterative search process. ExtMiner may also be applied as a platform for research on retrieval fusion, since it combines search, clustering and visualization algorithms. ExtMiner was evaluated with three document collections. Although the findings were encouraging, the user interface and the performance with large document repositories need further development. Peer reviewed.

Information management | dokumenttien hakumenetelmät | klusterointi | Document retrieval | Information retrieval | Computer science | Document clustering | XML | computer.software_genre | Ranking (information retrieval) | document clustering | Ranking | Human–computer information retrieval | Relevance (information retrieval) | Data mining | User interface | Document retrieval | Cluster analysis | computer
researchProduct

Document Word Clouds: Visualising Web Documents as Tag Clouds to Aid Users in Relevance Decisions

2009

Contains the full text. Information Retrieval systems spend great effort on determining the significant terms in a document. When a user is looking at a document, however, he cannot benefit from such information. He has to read the text to understand which words are important. In this paper we take a look at the idea of enhancing the perception of web documents with visualisation techniques borrowed from the tag clouds of Web 2.0. Highlighting the important words in a document by using a larger font size allows the reader to get a quick impression of the relevant concepts in a text. As this process does not depend on a user query, it can also be used for explorative search. A user study showed, th…

Information retrieval | Process (engineering) | Computer science | media_common.quotation_subject | Document clustering | User requirements document | World Wide Web | Perception | Relevance (information retrieval) | Tag cloud | tf–idf | Τεχνικές υπηρεσίες σε βιβλιοθήκες αρχεία και μουσεία | Technical services in libraries archives and museums | Word (computer architecture) | media_common
researchProduct

ProcMiner: Advancing Process Analysis and Management

2007

This paper contributes both to research and practice on process mining. Previous research on process mining has focused on mining patterns from event log files to generate process models. The process mining approach adopted in this paper is focused on producing patterns about process models, not the models themselves. The approach is demonstrated by ProcMiner, an explorative research prototype for managing, consolidating, publishing, retrieving, and analyzing process models. Content-based document clustering is applied to process models represented as an XML database in order to find topical groups among the models. In practice, organizations face numerous challenges in managing their process mod…

klusterointi | Process modeling | prosessitiedon analysonti | Event (computing) | Computer science | computer.internet_protocol | process mining | Process mining | Document clustering | XML | computer.software_genre | Data science | prosessijohtaminen | document clustering | Consistency (database systems) | XML database | Quality management system | prosessien hallinta | computer | XML
researchProduct